Add tests for maximal reconvergence. #473
s-perron
left a comment
These are a good couple of tests to start testing convergence. However, the feature checks need to be improved to make sure they run for DirectX as well.
557f48f to 1251f6e
Looks like Feature/MaximalReconvergence/subgroup_uniform_control_flow.test is failing for DirectX in a different way. Intel clang DirectX: Data: [ 8, 9, 8, 9, 14, 7, 7, 7 ] @spall What is the best way to handle this failure? I'm guessing there are two problems: one in WARP and one in the clang compiler.
So dxc and clang have different results, and it's determined that the dxc results are correct? Also, WARP is always wrong? I guess we file an issue on Clang and WARP. @llvm-beanz how should we file an issue on WARP for this?
Co-authored-by: Steven Perron <stevenperron@google.com>
1242a59 to 32cbead
32cbead to a9a1d46
…en the loop contains control convergence operations. (#165643)

Skip constant folding the loop predicates if the loop contains control convergence tokens referenced outside the loop.

Fixes #164496.

Verified [loop_peeling.test](llvm/offload-test-suite#473) passes with the fix.

Similar control convergence issues are found on other passes. #165642

HLSL used for tests:

```hlsl
RWStructuredBuffer<uint> Out : register(u0);

[numthreads(8,1,1)]
void main(uint3 TID : SV_GroupThreadID) {
  for (uint i = 0; i < 8; i++) {
    if (i == TID.x) {
      Out[TID.x] = WaveActiveMax(TID.x);
      break;
    }
  }
}
```

With nested loop:

```hlsl
RWStructuredBuffer<uint> Out : register(u0);

[numthreads(8,8,1)]
void main(uint3 TID : SV_GroupThreadID) {
  for (uint i = 0; i < 8; i++) {
    for (uint j = 0; j < 8; j++) {
      if (i == TID.x && j == TID.y) {
        uint index = TID.x * 8 + TID.y;
        Out[index] = WaveActiveMax(index);
        break;
      }
    }
  }
}
```
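For reference, the result the single-loop shader above should produce can be worked out on the host. The sketch below is a small Python model, not part of the test suite. It assumes all eight lanes run in one wave and that the implementation provides maximal reconvergence, so only the lanes that take the `if` in a given iteration participate in `WaveActiveMax`. The function name `simulate_single_loop` is hypothetical.

```python
def simulate_single_loop(num_lanes=8):
    """Model the single-loop shader under maximal reconvergence (assumed)."""
    out = [None] * num_lanes
    active = set(range(num_lanes))  # lanes still executing the loop

    for i in range(num_lanes):
        # Lanes for which `i == TID.x` holds in this iteration.
        taken = {lane for lane in active if lane == i}
        if taken:
            # WaveActiveMax only sees the lanes inside the branch.
            wave_max = max(taken)
            for lane in taken:
                out[lane] = wave_max
            # `break`: these lanes leave the loop.
            active -= taken
    return out


if __name__ == "__main__":
    print(simulate_single_loop())
```

Under these assumptions each lane is alone in the branch when it calls `WaveActiveMax`, so every lane writes its own index. A lane reporting a larger value would suggest that lanes which should have diverged were treated as converged at the call.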
Added 2 tests based on the article: subgroup uniform control flow and loop peeling.
Addresses llvm/llvm-project#136930